18.05 Lecture 29 
April 25, 2005 



Score distribution for Test 2: 
70-100 A, 40-70 B, 20-40 C, 10-20 D 
Average = 45 

Hypothesis Testing.

$X_1, \ldots, X_n$ with unknown distribution $P$
Hypothesis possibilities:
$H_1 : P = P_1$

$H_2 : P = P_2$

...

$H_k : P = P_k$

There are k simple hypotheses. 

A simple hypothesis states that the distribution is equal to a particular probability distribution. 
Consider two normal distributions: N(0, 1) and N(1, 1).




[Figure omitted; axis labels: $X_1$, 1]

There is only 1 point of data: $X_1$

Depending on where the point is, it is more likely to come from either N(0, 1) or N(1, 1).

Hypothesis testing is similar to maximum likelihood:

Within your k choices, pick the most likely distribution given the data. 
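As a rough illustration of picking the most likely distribution for a single data point, the sketch below compares the two densities N(0, 1) and N(1, 1) at an observed value. The function name `most_likely` and the sample values 0.2 and 0.9 are made up for illustration and are not part of the lecture.

```python
from statistics import NormalDist

# The two candidate distributions from the lecture example.
hypotheses = {"N(0,1)": NormalDist(0.0, 1.0), "N(1,1)": NormalDist(1.0, 1.0)}

def most_likely(x1):
    """Return the name of the hypothesis whose density is largest at x1."""
    return max(hypotheses, key=lambda name: hypotheses[name].pdf(x1))

# Hypothetical observations, chosen only for illustration.
print(most_likely(0.2))
print(most_likely(0.9))
```

A point closer to 0 favors N(0, 1) and a point closer to 1 favors N(1, 1), since both densities have the same variance.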

However, hypothesis testing is NOT like estimation theory, as there is a different goal: 

Definition: Error of type i 

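$P(\text{make a mistake} \mid H_i \text{ is true}) = \alpha_i$

Decision Rule: $\delta : \mathcal{X}^n \to \{H_1, H_2, \ldots, H_k\}$

Given a sample $(X_1, \ldots, X_n)$, $\delta(X_1, \ldots, X_n) \in \{H_1, \ldots, H_k\}$

$\alpha_i = P(\delta \ne H_i \mid H_i)$ - error of type i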

"The decision rule picks the wrong hypothesis" = error. 

Example: Medical test, $H_1$ - positive, $H_2$ - negative.

Error of Type 1: $\alpha_1 = P(\delta \ne H_1 \mid H_1) = P(\text{negative} \mid \text{positive})$

Error of Type 2: $\alpha_2 = P(\delta \ne H_2 \mid H_2) = P(\text{positive} \mid \text{negative})$

These are very different errors, and their severity depends on the particular situation.
Example: Missile Detection vs. Airplane 

Type 1: $P(\text{airplane} \mid \text{missile})$, Type 2: $P(\text{missile} \mid \text{airplane})$
The consequences differ greatly depending on which error is made.

Bayes Decision Rules 

Choose a prior distribution on the hypothesis. 
Assign a weight to each hypothesis, based upon the importance of the different errors.
$\xi(1), \ldots, \xi(k) \ge 0$, $\sum_i \xi(i) = 1$

Bayes error: $\alpha(\xi) = \xi(1)\alpha_1 + \xi(2)\alpha_2 + \cdots + \xi(k)\alpha_k$

Choose the decision rule that minimizes the Bayes error.
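The Bayes error is just a weighted sum of the individual error probabilities. A minimal sketch follows; the function name `bayes_error` and the numbers in the example call are hypothetical and chosen only to show the arithmetic.

```python
import math

def bayes_error(weights, errors):
    """Weighted sum xi(1)*alpha_1 + ... + xi(k)*alpha_k."""
    if len(weights) != len(errors):
        raise ValueError("need one weight per error probability")
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * a for w, a in zip(weights, errors))

# Hypothetical weights and error probabilities for k = 2 hypotheses.
example = bayes_error([0.5, 0.5], [0.10, 0.30])
print(example)
```

Raising the weight on one hypothesis makes its error count for more in the total, which is how the prior encodes the importance of the different errors.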

Simple solution to finding the decision rule: 

$X = (X_1, \ldots, X_n)$, let $f_i(x)$ be a p.f. or p.d.f. of $P_i$

$f_i(x) = f_i(x_1) \times \cdots \times f_i(x_n)$ - joint p.f./p.d.f.

Theorem: Bayes Decision Rule:

$\delta = \{H_i : \xi(i) f_i(x) = \max_{1 \le j \le k} \xi(j) f_j(x)\}$

Similar to max. likelihood. 

Find the largest of joint densities, but weighted in this case. 
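The weighted comparison of joint densities can be sketched in a few lines. This is only an illustration under stated assumptions: the candidates are taken to be normal distributions so that `statistics.NormalDist` can supply the densities, and the names `joint_density` and `bayes_decision` are invented here. Ties are resolved in favor of the first maximizer, which is one of the allowed choices.

```python
import math
from statistics import NormalDist

def joint_density(dist, sample):
    """Joint p.d.f. of an i.i.d. sample: the product of the marginal densities."""
    return math.prod(dist.pdf(x) for x in sample)

def bayes_decision(sample, dists, weights):
    """Return the 0-based index i that maximizes weights[i] * f_i(sample)."""
    scores = [w * joint_density(d, sample) for d, w in zip(dists, weights)]
    return max(range(len(scores)), key=scores.__getitem__)

# Illustrative use with the two normal distributions from the lecture.
dists = [NormalDist(0.0, 1.0), NormalDist(1.0, 1.0)]
print(bayes_decision([0.1, 0.3], dists, [0.5, 0.5]))
print(bayes_decision([0.9, 1.2], dists, [0.5, 0.5]))
```

For long samples the product of densities can underflow; comparing sums of log-densities plus log-weights would be the usual remedy, at the cost of a slightly longer sketch.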

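$\alpha(\xi) = \sum_i \xi(i) P(\delta \ne H_i \mid H_i) = \sum_i \xi(i) \left(1 - P(\delta = H_i \mid H_i)\right) =$

$= 1 - \sum_i \xi(i) P(\delta = H_i \mid H_i) = 1 - \sum_i \xi(i) \int I(\delta(x) = H_i) f_i(x)\,dx =$

$= 1 - \int \left(\sum_i \xi(i) I(\delta(x) = H_i) f_i(x)\right) dx$ - minimize, so maximize the integral:

Function within the integral:

$I(\delta = H_1)\,\xi(1) f_1(x) + \cdots + I(\delta = H_k)\,\xi(k) f_k(x)$

The indicators pick the term:

$\delta = H_1$ : $1 \cdot \xi(1) f_1(x) + 0 + 0 + \cdots + 0$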

So, just choose the largest term to maximize the integral. 
Let S pick the largest term in the sum. 

Most of the time, we will consider 2 simple hypotheses: 

$\delta = \{H_1 : \xi(1) f_1(x) > \xi(2) f_2(x)$, i.e. $\frac{f_1(x)}{f_2(x)} > \frac{\xi(2)}{\xi(1)}$; $H_2$ if $<$; $H_1$ or $H_2$ if $=\}$

Example: 

$H_1 : N(0, 1)$, $H_2 : N(1, 1)$
$\xi(1)\alpha_1 + \xi(2)\alpha_2$ - minimize



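$f_1(x) = \left(\frac{1}{\sqrt{2\pi}}\right)^n e^{-\frac{1}{2}\sum x_i^2}$, $f_2(x) = \left(\frac{1}{\sqrt{2\pi}}\right)^n e^{-\frac{1}{2}\sum (x_i - 1)^2}$

$\frac{f_1(x)}{f_2(x)} = e^{-\frac{1}{2}\sum x_i^2 + \frac{1}{2}\sum (x_i - 1)^2} = e^{\frac{n}{2} - \sum x_i} > \frac{\xi(2)}{\xi(1)}$

$\delta = \{H_1 : \sum x_i < \frac{n}{2} - \log\frac{\xi(2)}{\xi(1)}$; $H_2$ if $>$; $H_1$ or $H_2$ if $=\}$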
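For the N(0, 1) versus N(1, 1) example, the Bayes rule reduces to comparing the sum of the observations with the threshold n/2 - log(xi(2)/xi(1)). The sketch below implements that comparison; the function names `threshold` and `decide` and the sample values are illustrative assumptions, not part of the lecture.

```python
import math

def threshold(n, xi1, xi2):
    """Cut-off for sum(x_i): n/2 - log(xi(2)/xi(1))."""
    return n / 2 - math.log(xi2 / xi1)

def decide(sample, xi1, xi2):
    """Choose H1 if the sum is below the cut-off, H2 if above, either on a tie."""
    s = sum(sample)
    t = threshold(len(sample), xi1, xi2)
    if s < t:
        return "H1"
    if s > t:
        return "H2"
    return "H1 or H2"

# Hypothetical samples with equal weights.
print(decide([0.1, 0.3], 0.5, 0.5))
print(decide([0.9, 1.2], 0.5, 0.5))
```

With equal weights the logarithm vanishes and the cut-off is n/2, the midpoint between the two means multiplied by the sample size.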

Considering the earlier example, N(0, 1) and N(1, 1)







$X_1$; $n = 1$, $\xi(1) = \xi(2) = \frac{1}{2}$

$\delta = \{H_1 : x_1 < \frac{1}{2}$; $H_2 : x_1 > \frac{1}{2}$; $H_1$ or $H_2$ if $=\}$
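To see numerically why the cut-off 1/2 is the right one for equal weights, the sketch below computes both error probabilities for rules of the form "choose H1 if x_1 < c" and compares the resulting Bayes errors over a few values of c. The helper names and the grid of cut-offs are assumptions made for this illustration.

```python
from statistics import NormalDist

p1 = NormalDist(0.0, 1.0)  # distribution under H1
p2 = NormalDist(1.0, 1.0)  # distribution under H2

def errors(c):
    """Errors of the rule 'choose H1 if x1 < c, otherwise H2'."""
    alpha1 = 1.0 - p1.cdf(c)  # H1 true, but x1 >= c so H2 is chosen
    alpha2 = p2.cdf(c)        # H2 true, but x1 < c so H1 is chosen
    return alpha1, alpha2

def bayes_error(c, xi1=0.5, xi2=0.5):
    """Weighted error xi(1)*alpha_1 + xi(2)*alpha_2 for cut-off c."""
    a1, a2 = errors(c)
    return xi1 * a1 + xi2 * a2

# A small, arbitrary grid of cut-offs for comparison.
for c in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(c, round(bayes_error(c), 4))
```

On this grid the Bayes error is smallest at c = 1/2, where the two error probabilities are also equal by symmetry.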
However, if one distribution were more important, it would be given more weight.




[Figure omitted; axis labels: 1/2, 1]

If N(0, 1) were more important, you would choose it more of the time, even on
some occasions when $x_1 > \frac{1}{2}$
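The effect of unequal weights can be checked directly by comparing the weighted densities at a single point. In the sketch below, the weights 0.8 and 0.2 and the observation 1.0 are hypothetical values chosen for illustration, and `weighted_choice` is an invented helper name.

```python
from statistics import NormalDist

f1 = NormalDist(0.0, 1.0).pdf  # density under H1
f2 = NormalDist(1.0, 1.0).pdf  # density under H2

def weighted_choice(x1, xi1, xi2):
    """Choose H1 when xi1*f1(x1) > xi2*f2(x1), H2 when smaller, either on a tie."""
    a, b = xi1 * f1(x1), xi2 * f2(x1)
    if a > b:
        return "H1"
    if a < b:
        return "H2"
    return "H1 or H2"

x1 = 1.0  # hypothetical observation, larger than 1/2
print(weighted_choice(x1, 0.5, 0.5))  # equal weights
print(weighted_choice(x1, 0.8, 0.2))  # N(0, 1) treated as more important
```

With equal weights the observation 1.0 is assigned to N(1, 1), but with the heavier weight on N(0, 1) the same observation is assigned to N(0, 1).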

Definition: $H_1, H_2$ - two simple hypotheses, then:
$\alpha_1(\delta) = P(\delta \ne H_1 \mid H_1)$ - level of significance.
$\beta(\delta) = 1 - \alpha_2(\delta) = P(\delta = H_2 \mid H_2)$ - power.
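For the N(0, 1) versus N(1, 1) example, the level of significance and the power of a rule "choose H2 if the sum of the observations exceeds a cut-off" can be computed exactly, because the sum of n independent N(mu, 1) variables is N(n*mu, n). The sketch below uses that fact; the function name `level_and_power` and the choice n = 4 with cut-off 2 are assumptions for illustration.

```python
import math
from statistics import NormalDist

def level_and_power(n, cutoff):
    """Level and power of the rule 'choose H2 if sum(x_i) > cutoff'."""
    sd = math.sqrt(n)
    sum_under_h1 = NormalDist(0.0, sd)       # distribution of the sum under N(0, 1)
    sum_under_h2 = NormalDist(float(n), sd)  # distribution of the sum under N(1, 1)
    level = 1.0 - sum_under_h1.cdf(cutoff)   # P(choose H2 | H1)
    power = 1.0 - sum_under_h2.cdf(cutoff)   # P(choose H2 | H2)
    return level, power

# Hypothetical setting: n = 4 observations, equal-weight cut-off n/2 = 2.
level, power = level_and_power(4, 2.0)
print(round(level, 4), round(power, 4))
```

Moving the cut-off up lowers the level of significance but also lowers the power, which is the trade-off between the two types of error.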
For more than 2 hypotheses, 

$\alpha_1(\delta)$ is always the level of significance, because $H_1$ is always the
most important hypothesis.

$\beta(\delta)$ becomes a power function, with respect to each extra hypothesis.
Definition: $H_0$ - null hypothesis

Example: when a drug company evaluates a new drug,
the null hypothesis is that it doesn't work.
$H_0$ is what you want to disprove first and foremost;
you don't want to make that error!

Next time: consider class of decision rules. 
$K_\alpha = \{\delta : \alpha_1(\delta) \le \alpha\}$, $\alpha \in [0, 1]$
Minimize $\alpha_2(\delta)$ within the class $K_\alpha$
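As a small preview of this constrained problem, the sketch below restricts attention to single-observation rules of the form "choose H1 if x_1 < c" for N(0, 1) versus N(1, 1). That restriction is an assumption made here for illustration; the lecture leaves the general class of rules for next time. Within this family, the first error is 1 - Phi(c) and the second is Phi(c - 1), so the smallest second error subject to the level constraint is attained at the smallest allowed c.

```python
from statistics import NormalDist

std_normal = NormalDist(0.0, 1.0)   # distribution under H1
alt_normal = NormalDist(1.0, 1.0)   # distribution under H2

def best_threshold_rule(alpha):
    """Among rules 'choose H1 if x1 < c', return (c, alpha2) with
    alpha1 <= alpha and alpha2 as small as possible."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    c = std_normal.inv_cdf(1.0 - alpha)  # smallest c with 1 - Phi(c) <= alpha
    alpha2 = alt_normal.cdf(c)           # P(choose H1 | H2)
    return c, alpha2

# Hypothetical level, chosen only for illustration.
c, alpha2 = best_threshold_rule(0.05)
print(round(c, 4), round(alpha2, 4))
```

A stricter level pushes the cut-off to the right and increases the second error, so the constraint has a real cost in power.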

** End of Lecture 29 